1,295 research outputs found

    Indirect Match Highlights Detection with Deep Convolutional Neural Networks

    Full text link
    Highlights in a sport video are usually referred as actions that stimulate excitement or attract attention of the audience. A big effort is spent in designing techniques which find automatically highlights, in order to automatize the otherwise manual editing process. Most of the state-of-the-art approaches try to solve the problem by training a classifier using the information extracted on the tv-like framing of players playing on the game pitch, learning to detect game actions which are labeled by human observers according to their perception of highlight. Obviously, this is a long and expensive work. In this paper, we reverse the paradigm: instead of looking at the gameplay, inferring what could be exciting for the audience, we directly analyze the audience behavior, which we assume is triggered by events happening during the game. We apply deep 3D Convolutional Neural Network (3D-CNN) to extract visual features from cropped video recordings of the supporters that are attending the event. Outputs of the crops belonging to the same frame are then accumulated to produce a value indicating the Highlight Likelihood (HL) which is then used to discriminate between positive (i.e. when a highlight occurs) and negative samples (i.e. standard play or time-outs). Experimental results on a public dataset of ice-hockey matches demonstrate the effectiveness of our method and promote further research in this new exciting direction.Comment: "Social Signal Processing and Beyond" workshop, in conjunction with ICIAP 201

    Keller--Osserman conditions for diffusion-type operators on Riemannian Manifolds

    Get PDF
    In this paper we obtain generalized Keller-Osserman conditions for wide classes of differential inequalities on weighted Riemannian manifolds of the form Lu≥b(x)f(u)ℓ(∣∇u∣)L u\geq b(x) f(u) \ell(|\nabla u|) and Lu≥b(x)f(u)ℓ(∣∇u∣)−g(u)h(∣∇u∣)L u\geq b(x) f(u) \ell(|\nabla u|) - g(u) h(|\nabla u|), where LL is a non-linear diffusion-type operator. Prototypical examples of these operators are the pp-Laplacian and the mean curvature operator. While we concentrate on non-existence results, in many instances the conditions we describe are in fact necessary for non-existence. The geometry of the underlying manifold does not affect the form of the Keller-Osserman conditions, but is reflected, via bounds for the modified Bakry-Emery Ricci curvature, by growth conditions for the functions bb and ℓ\ell. We also describe a weak maximum principle related to inequalities of the above form which extends and improves previous results valid for the \vp-Laplacian

    On the 1/H1/H-flow by pp-Laplace approximation: new estimates via fake distances under Ricci lower bounds

    Full text link
    In this paper we show the existence of weak solutions w:M→Rw : M \rightarrow \mathbb{R} of the inverse mean curvature flow starting from a relatively compact set (possibly, a point) on a large class of manifolds satisfying Ricci lower bounds. Under natural assumptions, we obtain sharp estimates for the growth of ww and for the mean curvature of its level sets, that are well behaved with respect to Gromov-Hausdorff convergence. The construction follows R. Moser's approximation procedure via the pp-Laplace equation, and relies on new gradient and decay estimates for pp-harmonic capacity potentials, notably for the kernel Gp\mathcal{G}_p of Δp\Delta_p. These bounds, stable as p→1p \rightarrow 1, are achieved by studying fake distances associated to capacity potentials and Green kernels. We conclude by investigating some basic isoperimetric properties of the level sets of ww.Comment: 61 pages. Revised version. Section 3.2 (properness under volume doubling and weak Poincar\'e inequalities, p.41-45) was rewritten, and the main Theorems 1.4 and 4.6 changed accordingl

    Ricci almost solitons

    Get PDF
    We introduce a natural extension of the concept of gradient Ricci soliton: the Ricci almost soliton. We provide existence and rigidity results, we deduce a-priori curvature estimates and isolation phenomena, and we investigate some topological properties. A number of differential identities involving the relevant geometric quantities are derived. Some basic tools from the weighted manifold theory such as general weighted volume comparisons and maximum principles at infinity for diffusion operators are discussed

    F-formation Detection: Individuating Free-standing Conversational Groups in Images

    Full text link
    Detection of groups of interacting people is a very interesting and useful task in many modern technologies, with application fields spanning from video-surveillance to social robotics. In this paper we first furnish a rigorous definition of group considering the background of the social sciences: this allows us to specify many kinds of group, so far neglected in the Computer Vision literature. On top of this taxonomy, we present a detailed state of the art on the group detection algorithms. Then, as a main contribution, we present a brand new method for the automatic detection of groups in still images, which is based on a graph-cuts framework for clustering individuals; in particular we are able to codify in a computational sense the sociological definition of F-formation, that is very useful to encode a group having only proxemic information: position and orientation of people. We call the proposed method Graph-Cuts for F-formation (GCFF). We show how GCFF definitely outperforms all the state of the art methods in terms of different accuracy measures (some of them are brand new), demonstrating also a strong robustness to noise and versatility in recognizing groups of various cardinality.Comment: 32 pages, submitted to PLOS On

    The Visual Social Distancing Problem

    Get PDF
    One of the main and most effective measures to contain the recent viral outbreak is the maintenance of the so-called Social Distancing (SD). To comply with this constraint, workplaces, public institutions, transports and schools will likely adopt restrictions over the minimum inter-personal distance between people. Given this actual scenario, it is crucial to massively measure the compliance to such physical constraint in our life, in order to figure out the reasons of the possible breaks of such distance limitations, and understand if this implies a possible threat given the scene context. All of this, complying with privacy policies and making the measurement acceptable. To this end, we introduce the Visual Social Distancing (VSD) problem, defined as the automatic estimation of the inter-personal distance from an image, and the characterization of the related people aggregations. VSD is pivotal for a non-invasive analysis to whether people comply with the SD restriction, and to provide statistics about the level of safety of specific areas whenever this constraint is violated. We then discuss how VSD relates with previous literature in Social Signal Processing and indicate which existing Computer Vision methods can be used to manage such problem. We conclude with future challenges related to the effectiveness of VSD systems, ethical implications and future application scenarios.Comment: 9 pages, 5 figures. All the authors equally contributed to this manuscript and they are listed by alphabetical order. Under submissio

    MX-LSTM: mixing tracklets and vislets to jointly forecast trajectories and head poses

    Get PDF
    Recent approaches on trajectory forecasting use tracklets to predict the future positions of pedestrians exploiting Long Short Term Memory (LSTM) architectures. This paper shows that adding vislets, that is, short sequences of head pose estimations, allows to increase significantly the trajectory forecasting performance. We then propose to use vislets in a novel framework called MX-LSTM, capturing the interplay between tracklets and vislets thanks to a joint unconstrained optimization of full covariance matrices during the LSTM backpropagation. At the same time, MX-LSTM predicts the future head poses, increasing the standard capabilities of the long-term trajectory forecasting approaches. With standard head pose estimators and an attentional-based social pooling, MX-LSTM scores the new trajectory forecasting state-of-the-art in all the considered datasets (Zara01, Zara02, UCY, and TownCentre) with a dramatic margin when the pedestrians slow down, a case where most of the forecasting approaches struggle to provide an accurate solution.Comment: 10 pages, 3 figures to appear in CVPR 201

    Forecasting People Trajectories and Head Poses by Jointly Reasoning on Tracklets and Vislets

    Full text link
    In this work, we explore the correlation between people trajectories and their head orientations. We argue that people trajectory and head pose forecasting can be modelled as a joint problem. Recent approaches on trajectory forecasting leverage short-term trajectories (aka tracklets) of pedestrians to predict their future paths. In addition, sociological cues, such as expected destination or pedestrian interaction, are often combined with tracklets. In this paper, we propose MiXing-LSTM (MX-LSTM) to capture the interplay between positions and head orientations (vislets) thanks to a joint unconstrained optimization of full covariance matrices during the LSTM backpropagation. We additionally exploit the head orientations as a proxy for the visual attention, when modeling social interactions. MX-LSTM predicts future pedestrians location and head pose, increasing the standard capabilities of the current approaches on long-term trajectory forecasting. Compared to the state-of-the-art, our approach shows better performances on an extensive set of public benchmarks. MX-LSTM is particularly effective when people move slowly, i.e. the most challenging scenario for all other models. The proposed approach also allows for accurate predictions on a longer time horizon.Comment: Accepted at IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019. arXiv admin note: text overlap with arXiv:1805.0065

    Recognition self-awareness for active object recognition on depth images

    Get PDF
    We propose an active object recognition framework that introduces the recognition self-awareness, which is an intermediate level of reasoning to decide which views to cover during the object exploration. This is built first by learning a multi-view deep 3D object classifier; subsequently, a 3D dense saliency volume is generated by fusing together single-view visualization maps, these latter obtained by computing the gradient map of the class label on different image planes. The saliency volume indicates which object parts the classifier considers more important for deciding a class. Finally, the volume is injected in the observation model of a Partially Observable Markov Decision Process (POMDP). In practice, the robot decides which views to cover, depending on the expected ability of the classifier to discriminate an object class by observing a specific part. For example, the robot will look for the engine to discriminate between a bicycle and a motorbike, since the classifier has found that part as highly discriminative. Experiments are carried out on depth images with both simulated and real data, showing that our framework predicts the object class with higher accuracy and lower energy consumption than a set of alternatives
    • …
    corecore